Picking items for experimental sets: measures of similarity and methods for optimisation

Boland, N.; Bunder, R.; Heathcote, A.

Title: Picking items for experimental sets: measures of similarity and methods for optimisation
Creator: Boland, N.; Bunder, R.; Heathcote, A.
Relation: MODSIM2013, 20th International Congress on Modelling and Simulation: Proceedings (Adelaide, SA, 01-06 December 2013), p. 3274-3280
Relation: https://www.mssanz.org.au/modsim2013
Publisher: Modelling and Simulation Society of Australia and New Zealand (MSSANZ)
Resource Type: conference paper
Date: 2013
Description: Experimental psychologists often conduct experiments in which subjects are exposed to sets of stimuli. For example, human subjects may be shown a sequence of written words and their response times recorded, in order to understand the effect of one attribute, such as the frequency of the word in spoken language, on human response time. The psychologists designing the experiment will construct several sets of words so that each set contains only words within a specified range of frequencies in the spoken language. To reduce the risk of bias in the experiment, the psychologists would like the selected sets of words to be similar in terms of other confounding attributes that could affect response time, such as the number of letters in the word or the number of syllables. A challenge for the psychologists is that the sets they select may need to contain many words, the words may be selected from a set of thousands, and a large number of potentially confounding attributes may need to be considered. This daunting task, which we dub the problem of Picking Items for Experimental Sets (PIES), is usually performed manually by experimental psychologists. To assist in this task, both metaheuristic and mixed integer programming (MIP) approaches have recently been developed.
Such automated approaches require a systematic definition of “similarity” of sets; the degree to which sets of items are similar with respect to some attribute can no longer be assessed subjectively by the psychologist designing the experiment (Forster, 2000). To illustrate this issue, consider two sets of words, $B_1$ and $B_2$, where $B_1, B_2 \subseteq W$, the set of words available for selection, and the attribute given by the number of letters in each word. For each word $w \in W$, let $f_{wl}$ denote the number of letters, $l$, in word $w$. One approach to measuring the similarity of the two sets is to compare the average value of the attribute across the sets, i.e. to measure based on the difference $\left| \frac{1}{|B_1|} \sum_{w \in B_1} f_{wl} - \frac{1}{|B_2|} \sum_{w \in B_2} f_{wl} \right|$. However, it is well known that very different distributions can have the same average value. For example, define the attribute value count vectors $\eta^{B_i}_c = |\{w \in B_i : f_{wl} = c\}|$ for each $i = 1, 2$ and each positive integer value $c$ that could be the length of a word, say ranging from 1 to 5 letters. The averaging approach would consider two sets with $\eta^{B_1} = (3, 3, 3, 3, 3)$ and $\eta^{B_2} = (0, 0, 15, 0, 0)$ to be very similar, since both have an average word length of 3, whereas clearly a human subject's experience of these two sets might be very different: the former has an even spread of word lengths whereas the latter has all words of identical length. The existing metaheuristics address this issue by using group characteristics, such as average or standard deviation, which take into account the relative values of the attributes. However, as we have shown, these group characteristics do not adequately measure the similarity of the sets.
Recent MIP approaches measure similarity between sets using the entire histogram, i.e. they measure based on the difference $|\eta^{B_1}_c - \eta^{B_2}_c|$ for each $c$. Whilst this provides a richer measure of similarity than simple averages, it does not take into account the relationships between attribute values. To return to the word length illustration, the length count vectors (0, 3, 3, 3, 6) and (3, 4, 5, 0, 3) are “equally” different from (3, 3, 3, 3, 3) component-wise. But it is common sense that words of length 2 or 3 are more similar to words of length 4 than words of length 1 are to words of length 5, so the vector (0, 3, 3, 3, 6), “replacing” three words of length 1 with three of length 5, is less similar to (3, 3, 3, 3, 3) than is (3, 4, 5, 0, 3), which “replaces” three words of length 4 with two of length 3 and one of length 2. The component-wise histogram measure does not take into account similarities and differences between attribute values.
This paper briefly reviews the existing approaches to automate picking items for experimental sets, and then discusses new MIP approaches that address the entire distribution of attribute values across sets while also taking into account the relationships between attribute values. Numerical results on psycholinguistic data sets are analysed, and the alternative approaches compared.
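The word-length illustration in the description can be checked numerically. The sketch below is a minimal one under stated assumptions: the function names are hypothetical, and the third measure (a one-dimensional transport, or "earth mover's", distance) is swapped in only as one possible way to account for relationships between attribute values. It is not claimed to be the measure used in the paper's MIP formulations, and it assumes equal-sized sets and unit-spaced attribute values.

```python
# Word lengths c = 1..5, as in the illustration in the description.
LENGTHS = [1, 2, 3, 4, 5]


def mean_value(counts, values=LENGTHS):
    """Average attribute value of a set described by its count vector."""
    return sum(v * n for v, n in zip(values, counts)) / sum(counts)


def mean_difference(a, b):
    """Average-based measure: absolute difference of the two set means."""
    return abs(mean_value(a) - mean_value(b))


def histogram_difference(a, b):
    """Component-wise histogram measure: sum over c of |eta_a[c] - eta_b[c]|."""
    return sum(abs(x - y) for x, y in zip(a, b))


def transport_difference(a, b):
    """Illustrative assumption, not the paper's measure: 1D transport distance.

    Accumulates the surplus of words that must be shifted past each length,
    so moving a word from length 1 to length 5 costs 4, not 1.
    Assumes both sets have the same size and lengths are spaced one apart.
    """
    running, total = 0, 0
    for x, y in zip(a, b):
        running += x - y
        total += abs(running)
    return total


even = (3, 3, 3, 3, 3)
peaked = (0, 0, 15, 0, 0)
far = (0, 3, 3, 3, 6)    # three words moved from length 1 to length 5
near = (3, 4, 5, 0, 3)   # three words moved from length 4 to lengths 3 and 2

print(mean_difference(even, peaked))       # averages agree despite different spreads
print(histogram_difference(even, far),
      histogram_difference(even, near))    # component-wise measure cannot tell them apart
print(transport_difference(even, far),
      transport_difference(even, near))    # distance-aware measure separates them
```

Under these assumptions the average-based measure sees no difference between the even and the peaked set, the component-wise measure scores both alternative vectors identically, and only the distance-aware measure ranks (3, 4, 5, 0, 3) as closer to (3, 3, 3, 3, 3).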
Subject: mixed integer programming; stimulus selection; factorial designs; experimental psychology
Identifier: http://hdl.handle.net/1959.13/1342271
Identifier: uon:28931
Identifier: ISBN:9780987214331
Language: eng
Reviewed
